Creators/Authors contains: "Chew, Joyce A."

  1. Classification and topic modeling are popular techniques in machine learning that extract information from large-scale datasets. By incorporating a priori information such as labels or important features, methods have been developed to perform classification and topic modeling tasks; however, most methods that can perform both do not allow for guidance of the topics or features. In this paper, we propose a novel method, namely Guided Semi-Supervised Non-negative Matrix Factorization (GSSNMF), that performs both classification and topic modeling by incorporating supervision from both pre-assigned document class labels and user-designed seed words. We test the performance of this method on legal documents provided by the California Innocence Project and the 20 Newsgroups dataset. Our results show that the proposed method improves both classification accuracy and topic coherence in comparison to past methods such as Semi-Supervised Non-negative Matrix Factorization (SSNMF), Guided Non-negative Matrix Factorization (Guided NMF), and Topic Supervised NMF. 
  2. Abstract

    Computation of binding constants from spectrophotometric titration data is a very popular application of chemometric hard modeling. However, the calculated values are misleading if the correct binding model is not used. Given that many supramolecular systems of interest feature unknown speciation, a priori determination of binding stoichiometry constitutes an important unsolved problem in chemometrics. We present a new and reliable algorithm for accomplishing this task, implemented using a hybrid particle swarm optimization technique. Simultaneous optimization of stoichiometry ratios and binding constants allows the optimal binding model to be calculated in just a few minutes for systems with up to four reactions. Simulated data studies demonstrate that the algorithm finds the correct stoichiometry with up to nine reactions in the absence of noise, including accurately determining species with unusual stoichiometry, such as H₂G₅. Application to four experimental datasets shows the algorithm is robust to experimental errors for a variety of chemical systems and binding models. This algorithm will facilitate the discovery of complex binding models, increase efficiency in titration analysis, and avert incorrect stoichiometry models, thereby improving the reliability of binding constant information in spectrophotometric titrations.
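
As a rough illustration of the kind of fit the titration abstract describes, the sketch below uses a plain global-best particle swarm to recover a single binding constant from simulated, noise-free absorbance data. It is not the authors' hybrid algorithm and does not search over stoichiometry: it assumes a fixed 1:1 host-guest model in which only the complex absorbs, and all function names, swarm settings, and concentrations are hypothetical choices made for this example.

```python
import numpy as np

def complex_conc(log_k, h0, g0):
    """Equilibrium [HG] for H + G <-> HG (assumed 1:1 model), from the quadratic mass balance."""
    k = 10.0 ** log_k
    s = h0 + g0 + 1.0 / k
    return 0.5 * (s - np.sqrt(s * s - 4.0 * h0 * g0))

def residual(log_k, h0, g0, absorbance):
    """Sum of squared errors; the molar absorptivity is solved by linear least squares."""
    c = complex_conc(log_k, h0, g0)
    eps = np.dot(c, absorbance) / np.dot(c, c)
    return float(np.sum((absorbance - eps * c) ** 2))

def pso_fit(h0, g0, absorbance, bounds=(1.0, 7.0), n_particles=20, n_iter=100, seed=0):
    """Plain global-best PSO over log10(K); the settings are illustrative, not tuned."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pos = rng.uniform(lo, hi, n_particles)
    vel = np.zeros(n_particles)
    best_pos = pos.copy()
    best_val = np.array([residual(p, h0, g0, absorbance) for p in pos])
    g_best = best_pos[np.argmin(best_val)]
    for _ in range(n_iter):
        r1 = rng.random(n_particles)
        r2 = rng.random(n_particles)
        # Inertia term plus pulls toward each particle's best and the swarm's best.
        vel = 0.7 * vel + 1.5 * r1 * (best_pos - pos) + 1.5 * r2 * (g_best - pos)
        pos = np.clip(pos + vel, lo, hi)
        val = np.array([residual(p, h0, g0, absorbance) for p in pos])
        improved = val < best_val
        best_pos[improved] = pos[improved]
        best_val[improved] = val[improved]
        g_best = best_pos[np.argmin(best_val)]
    return float(g_best)

# Simulated titration: fixed host concentration, increasing guest concentration.
h0 = 1e-4
g0 = np.linspace(0.0, 1e-3, 25)
true_log_k = 4.0
absorbance = 5000.0 * complex_conc(true_log_k, h0, g0)

log_k_fit = pso_fit(h0, g0, absorbance)
print("fitted log10(K):", log_k_fit)
```

Solving the absorptivity by least squares inside the residual keeps the swarm's search space to the nonlinear parameter only, which is a common hard-modeling arrangement; extending the search to stoichiometry ratios, as the abstract describes, would need a different parameterization.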

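
The GSSNMF abstract in the first record does not state its objective function, so the following is only a sketch of how label supervision and seed-word guidance could be combined in one non-negative factorization. It assumes a joint Frobenius objective ||X - WH||² + λ||Y - CH||² + μ||S - WB||², where X is a word-by-document matrix, Y holds one-hot class labels, and S marks seed words per guided topic, and it uses standard multiplicative updates. The objective, the names `guided_ssnmf_sketch`, `C`, `B`, `lam`, and `mu`, and the toy data are assumptions for illustration; the published method may differ, for example in how it handles unlabeled documents.

```python
import numpy as np

def joint_loss(X, Y, S, W, H, C, B, lam, mu):
    """Assumed objective: data fit + label fit + seed-word fit (squared Frobenius norms)."""
    return float(
        np.linalg.norm(X - W @ H) ** 2
        + lam * np.linalg.norm(Y - C @ H) ** 2
        + mu * np.linalg.norm(S - W @ B) ** 2
    )

def guided_ssnmf_sketch(X, Y, S, k, lam=1.0, mu=1.0, n_iter=200, seed=0):
    """Multiplicative updates for the assumed joint objective (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k))            # words x topics
    H = rng.random((k, n))            # topics x documents
    C = rng.random((Y.shape[0], k))   # classes x topics
    B = rng.random((k, S.shape[1]))   # topics x seed groups
    eps = 1e-10                       # guards against division by zero
    history = []
    for _ in range(n_iter):
        W *= (X @ H.T + mu * S @ B.T) / (W @ (H @ H.T) + mu * W @ (B @ B.T) + eps)
        H *= (W.T @ X + lam * C.T @ Y) / ((W.T @ W) @ H + lam * (C.T @ C) @ H + eps)
        C *= (Y @ H.T) / (C @ (H @ H.T) + eps)
        B *= (W.T @ S) / ((W.T @ W) @ B + eps)
        history.append(joint_loss(X, Y, S, W, H, C, B, lam, mu))
    return W, H, C, B, history

# Toy corpus: 12 words, 20 documents, two word blocks that line up with two classes.
rng = np.random.default_rng(1)
X = 0.1 * rng.random((12, 20))
X[:6, :10] += 1.0
X[6:, 10:] += 1.0
Y = np.zeros((2, 20))
Y[0, :10] = 1.0
Y[1, 10:] = 1.0
S = np.zeros((12, 2))
S[[0, 1], 0] = 1.0   # seed words for guided topic 0
S[[6, 7], 1] = 1.0   # seed words for guided topic 1

W, H, C, B, history = guided_ssnmf_sketch(X, Y, S, k=2)
predicted = np.argmax(C @ H, axis=0)   # class scores per document
print("loss:", history[0], "->", history[-1])
```

In this arrangement the columns of W can be read as topics and `C @ H` as class scores, so a single factorization serves both tasks; the weights `lam` and `mu` trade off reconstruction against the two kinds of supervision.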